Divergence and Shannon information in genomes.
Authors
Abstract
Shannon information (SI) and its special case, divergence, are defined for a DNA sequence in terms of probabilities of chemical words in the sequence and are computed for a set of complete genomes highly diverse in length and composition. We find the following: SI (but not divergence) is inversely proportional to sequence length for a random sequence but is length-independent for genomes; the genomic SI is always greater and, for shorter words and longer sequences, hundreds to thousands of times greater than the SI in a random sequence whose length and composition match those of the genome; genomic SIs appear to have word-length-dependent universal values. The universality is inferred to be an evolutionary footprint of a universal mode for genome growth.
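As a rough illustration of the comparison described in the abstract, here is a minimal Python sketch. The helper names and the SI normalization are illustrative assumptions, not the paper's code: SI for word length k is taken as the deficit of the k-word entropy from its maximum ln 4^k, and the sequence is compared against a shuffled counterpart of matching length and composition; a real genome would be read from a FASTA file.

```python
import math
import random
from collections import Counter

def shannon_information(seq, k):
    """SI of overlapping k-words, taken here (an assumption) as the
    deficit of the word entropy from its maximum value ln(4**k)."""
    n = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n))
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return k * math.log(4) - entropy

def matched_random(seq):
    """Random sequence with the same length and base composition as seq."""
    bases = list(seq)
    random.shuffle(bases)
    return "".join(bases)

# Stand-in sequence; a real genome would be read from a FASTA file.
genome = "".join(random.choice("ACGT") for _ in range(50_000))
for k in (2, 4, 6):
    print(k, shannon_information(genome, k),
          shannon_information(matched_random(genome), k))
```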
Similar articles
A Goodness of Fit Test For Exponentiality Based on Lin-Wong Information
In this paper, we introduce a goodness-of-fit test for exponentiality based on the Lin-Wong divergence measure. In order to estimate the divergence, we use a method similar to Vasicek's method for estimating the Shannon entropy. The critical values and the powers of the test are computed by Monte Carlo simulation. It is shown that the proposed test is competitive with other tests of exponentiality...
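The abstract cites Vasicek's spacing method for entropy estimation; a minimal sketch of that estimator follows. The Lin-Wong analogue actually used in the paper is not reproduced here, and the window parameter m is a tuning choice.

```python
import numpy as np

def vasicek_entropy(sample, m):
    """Vasicek (1976) spacing estimator of Shannon entropy:
    mean of log(n/(2m) * (X_(i+m) - X_(i-m))) over the order
    statistics, with indices clamped at the sample boundaries."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    i = np.arange(n)
    upper = x[np.minimum(i + m, n - 1)]
    lower = x[np.maximum(i - m, 0)]
    return float(np.mean(np.log(n / (2 * m) * (upper - lower))))

rng = np.random.default_rng(0)
print(vasicek_entropy(rng.exponential(size=200), m=5))  # Exp(1): true entropy = 1
```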
Jensen divergence based on Fisher's information
The measure of Jensen-Fisher divergence between probability distributions is introduced and its theoretical grounds are set up. This quantity, in contrast to the other Jensen divergences, is very sensitive to fluctuations of the probability distributions because it is controlled by the (local) Fisher information, which is a gradient functional of the distribution. So it is appropriate and ...
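A hedged sketch of the quantity being described, assuming the Jensen-Fisher divergence takes the form analogous to Jensen-Shannon: the average Fisher information of two densities minus that of their 50/50 mixture (nonnegative because Fisher information is convex in the density). It is computed here on a uniform grid by finite differences.

```python
import numpy as np

def fisher_information(p, dx):
    """Nonparametric Fisher information of a density sampled on a
    uniform grid: the integral of (p')**2 / p, via finite differences."""
    dp = np.gradient(p, dx)
    mask = p > 0
    return float(np.sum(dp[mask] ** 2 / p[mask]) * dx)

def jensen_fisher(p, q, dx):
    """Assumed Jensen-Fisher form, by analogy with Jensen-Shannon:
    mean Fisher information minus that of the 50/50 mixture."""
    m = 0.5 * (p + q)
    return (0.5 * (fisher_information(p, dx) + fisher_information(q, dx))
            - fisher_information(m, dx))

x = np.linspace(-6, 6, 2001)
dx = x[1] - x[0]
gauss = lambda mu: np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)
print(jensen_fisher(gauss(0.0), gauss(1.0), dx))
```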
A family of statistical symmetric divergences based on Jensen's inequality
We introduce a novel parametric family of symmetric information-theoretic distances based on Jensen’s inequality for a convex functional generator. In particular, this family unifies the celebrated Jeffreys divergence with the Jensen-Shannon divergence when the Shannon entropy generator is chosen. We then design a generic algorithm to compute the unique centroid defined as the minimum average d...
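For the Shannon-entropy member of this family, a minimal sketch of the (symmetric, nonnegative) Jensen-Shannon divergence for discrete distributions:

```python
import numpy as np

def shannon_entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def jensen_shannon(p, q):
    """Jensen divergence generated by Shannon entropy:
    H((p+q)/2) - (H(p) + H(q))/2."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    return shannon_entropy(m) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))

print(jensen_shannon([0.5, 0.5], [0.9, 0.1]))
print(jensen_shannon([0.9, 0.1], [0.5, 0.5]))  # same value: symmetric
```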
Bounds on Non-Symmetric Divergence Measures in Terms of Symmetric Divergence Measures
Many information and divergence measures exist in the literature on information theory and statistics. The most famous among them are the Kullback-Leibler [13] relative information and the Jeffreys [12] J-divergence. The Jensen-Shannon divergence of Sibson [17] has also found applications in the literature. The author [20] studied new divergence measures based on arithmetic and geometric means...
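To make the symmetric/non-symmetric distinction concrete, here is a small sketch of the Kullback-Leibler relative information and the Jeffreys J-divergence that symmetrizes it; the specific bounds studied in the paper are not reproduced here.

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler relative information (non-symmetric)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jeffreys(p, q):
    """Jeffreys J-divergence: the symmetrized sum KL(p,q) + KL(q,p)."""
    return kl(p, q) + kl(q, p)

p, q = [0.2, 0.8], [0.6, 0.4]
print(kl(p, q), kl(q, p))   # the two directions differ: KL is non-symmetric
print(jeffreys(p, q))       # their sum, symmetric in (p, q)
```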
Universal Lengths in Microbial Genomes and Implication for Early Genome Growth
We report the discovery of a set of universal lengths that characterize all microbial complete genomes. The Shannon information [Shannon 1948] of 108 complete microbial genomes relative to that of their respective randomized counterparts is computed, and the results are summarized in a two-parameter exponential relation: Lr(k) = (42 ± 21) × 2.64^k, 2 ≤ k ≤ 10, where Lr is a "root-sequence length" ...
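Taking the quoted relation at face value (central value 42 only; the stated uncertainty is ±21), the universal lengths for each word length k can be tabulated directly:

```python
# Tabulate Lr(k) = 42 * 2.64**k for word lengths k = 2..10.
for k in range(2, 11):
    print(k, round(42 * 2.64 ** k))
```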
Journal: Physical Review Letters
Volume 94, Issue 17
Pages: -
Published: 2005